Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dataflow: Simplify revFlowThrough #18355

Closed

Conversation

smowton
Copy link
Contributor

@smowton smowton commented Dec 20, 2024

Observations:

  • revFlowThrough can be much larger than the other reverse-flow predicates, presumably when there are many different innerReturnAps.
  • It is only ever used in conjunction with flowThroughIntoCall, which can therefore be pushed in, and several of its parameters can thereby be dropped in exchange for exposing arg.
  • revFlowThroughArg can then be trivially inlined.

Result: on repository go-gitea/gitea with PR #17701 producing a wider selection of access paths than are seen on main, revFlowThrough drops in size from ~120m tuples to ~4m, and the runtime of the reverse-flow computation for dataflow stage 4 goes from dominating the forward-flow cost to relatively insignificant. Overall runtime falls from 3 minutes to 2 with substantial ram available, and presumably falls much more under GHA-style memory pressure.

Observations:
* revFlowThrough can be much larger than the other reverse-flow predicates, presumably when there are many different innerReturnAps.
* It is only ever used in conjunction with flowThroughIntoCall, which can therefore be pushed in, and several of its parameters can thereby be dropped in exchange for exposing `arg`.
* `revFlowThroughArg` can then be trivially inlined.

Result: on repository `go-gitea/gitea` with PR github#17701 producing a wider selection of access paths than are seen on `main`, `revFlowThrough` drops in size from ~120m tuples to ~4m, and the runtime of the reverse-flow computation for dataflow stage 4 goes from dominating the forward-flow cost to relatively insignificant. Overall runtime falls from 3 minutes to 2 with substantial ram available, and presumably falls much more under GHA-style memory pressure.
@Copilot Copilot bot review requested due to automatic review settings December 20, 2024 19:21

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review any files in this pull request.

Files not reviewed (1)
  • shared/dataflow/codeql/dataflow/internal/DataFlowImpl.qll: Language not supported

Tip: Copilot only keeps its highest confidence comments to reduce noise and keep you focused. Learn more

Copy link
Contributor

@hvitved hvitved left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I have started DCA for all languages, let's wait for the result of that before merging.

@hvitved hvitved added the no-change-note-required This PR does not need a change note label Dec 21, 2024
Comment on lines +2417 to +2419
flowThroughIntoCall(call, arg, p, ap, innerReturnAp) and
revFlowParamToReturn(p, state, pos, innerReturnAp, ap) and
revFlowIsReturned(call, returnCtx, returnAp, pos, innerReturnAp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is disrupting a non-linear join, so it's very likely beneficial to push flowThroughIntoCall further into one of the recursive conjuncts in some way - as-is we're likely having some inefficiency with at least one of the two delta+prev combinations. Also, it would be nice to understand a bit deeper which columns are contributing to the blowup in which way.
We can safely push the projection flowThroughIntoCall(_, _, p, ap, innerReturnAp) into revFlowParamToReturn, which may be beneficial, as that's a pure filter on a pre-non-linear-join conjunct, but we cannot push in the other columns as that would amount to a join with the call-graph a bit too soon (revFlowIsReturned is exactly meant to constrain that part as much as possible).
OTOH, it may very well be good to push flowThroughIntoCall in its entirety into revFlowIsReturned as that already contains the call graph edge. If a project of flowThroughIntoCall to a pure filter in that case turns out yield a beneficial tuple reduction, then the join of revFlowOut and returnFlowsThrough (which occurs in a few places) ought to be revised as flowThroughIntoCall already contains a projected version of returnFlowsThrough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried the two scenarios you're referring to here:

https://github.com/github/codeql/commits/smowton/perf/simplify-rev-flow-through-cand-1 goes for scenario 1, add a filter in revFlowParamToReturn. Dead simple; I'll start a DCA shortly.

https://github.com/github/codeql/commits/smowton/perf/simplify-rev-flow-through-cand-2 goes for scenario 2, "push flowThroughIntoCall in its entirety". However I imagine I've misinterpreted what you want there, since pushing it and adding parameters means we once again have a predicate revFlowIsReturned that materialises the product of the three Aps, which seems to be the source of the trouble.

I'm also not sure what you want re: the join of revFlowOut and returnFlowsThrough: I replaced _s on the one invocation of ReturnFlowsThrough that now coincided with its sibling flowThroughIntoCall, but found only one other use-site that combined revFlowOut and returnFlowsThrough, which seemed difficult to adapt to share anything with the other use-site.

I also found that I can no longer reproduce my original performance problem now that during the release we discovered the performance problem introduced by #18235. That suggests that the extra FP flow introduced via vararg array hairpin routes (e.g. x -> varargs(x, y) -> y) was crucial in the large blowup I was seeing, and this may not in fact be a relevant problem. I think it still likely is the case that revFlowThrough with its three Ap arguments is not ideal to materialise, and the materialisations resulting from this PR's simplifications are better in principle, but the specific motivation to chase this down has gone and may not return if this is too much of a pain to get merged.

@smowton
Copy link
Contributor Author

smowton commented Jan 10, 2025

Experiment for push-in scenario 2: https://github.com/github/codeql-dca-main/issues/25930. Accidentally flipped the two arms of the experiment; note the experimental branch is v0.

@smowton
Copy link
Contributor Author

smowton commented Jan 14, 2025

Scenario 2 seems on the DCA suite vs main slightly deleterious. Again I note I'm only observing the really serious consequences I observed on go-gitea when #18235 is applied, and the consequences relate to relation size rather than time and the accompanying memory pressure on materialisation and/or computation, so I expect the DCA projects are simply too small to show much up. I would recommend a QA run when we decide which variant is the best.

Scenario 1 experiment: https://github.com/github/codeql-dca-main/issues/26035, this time with the arms the correct way around.

@smowton
Copy link
Contributor Author

smowton commented Jan 15, 2025

@aschackmull I re-evaluated this vs. #17701 rebased against latest main, i.e. without the previously-problematic #18235. I found that again one project was a large outlier (+1h eval time) presumably due to lots of new flow (and given the trouble site in the dataflow lib, probably a lot of new flow-through access-path possibilities due to the template/text library now reading arbitrary content.

DCA to confirm that this fix helps in that case too: https://github.com/github/codeql-dca-main/issues/26038. Baseline analysis time for that project in QA is ~9 minutes, analysis time with this fix is ~1h, so I'm expecting to see an intermediate value from that experiment.

That experiment isn't using either of the optimisations you suggested at this point pending feedback re which if either we want to use.

@aschackmull
Copy link
Contributor

Alright, I've been digging into this a bit: To initially get a grip on what's going on here I measured tuple counts on the conjuncts of interest and various joins. Granted, I did this on a fairly innocent query+db, but I think I learned something nonetheless. Pushing projections of flowThroughIntoCall into the two sides yielded only minuscule differences, so that didn't seem very relevant.
Next I considered some of the semantic implications along what you said about the possible culprit being the materialisation of 3 access path columns. Fundamentally, we're tracking (reverse) flow to a parameter with a given access path ap plus a summary context innerReturnAp (ReturnPosition is also part of the summary context, btw.) in revFlowParamToReturn - that's essentially a filter on revFlow reaching a parameter. Then we want to join with a call edge in order to continue flow at the call site, and ap is the crucial access path that we need to preserve to copy to the argument node. But since we're in a through-flow scenario we'll get mismatched call sites and FP flow unless we do this as a summary edge, hence we need to join with the initially taken call edge (joining on the summary context) to get the summary edge in the outer scope. At this point we have 3 access paths in play: the outer summary, and the two access paths on either side of the summary edge (at the argument and at the out node). We've only gotten the call site at that point, which is why we join with flowThroughIntoCall to translate call and p to arg. This is close to functional on its own, so should be harmless, but we've amended flowThroughIntoCall with access path constraints for the summary edge. But since we've just established flow-through with the two access paths on either end of the summary edge, a natural question is to what degree those access paths as included in flowThroughIntoCall bring additional constraints? Put differently, would it affect pruning or semantics if we elided innerReturnAp from flowThroughIntoCall. Such a relaxation would, if sound, would solve this problem, I think, as we could project that column away directly after the non-linear join in revFlowThrough.

@aschackmull
Copy link
Contributor

How about #18515 instead?

@smowton
Copy link
Contributor Author

smowton commented Jan 20, 2025

Replaced by #18515

@smowton smowton closed this Jan 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
DataFlow Library no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants